Skip to content

Conversation

@mbutrovich
Copy link
Contributor

@mbutrovich mbutrovich commented Oct 22, 2025

What issue does this PR close?

Partially address #1749.

Rationale for this change

This PR fixes a bug in delete file loading when a FileScanTask contains both positional and equality delete files. We hit this when running Iceberg Java test suite via Comet in apache/datafusion-comet#2528. The tests that failed were

TestSparkExecutorCache > testMergeOnReadUpdate()
TestSparkExecutorCache > testMergeOnReadMerge()
TestSparkExecutorCache > testMergeOnReadDelete()

The Bug:
The condition in try_start_eq_del_load (delete_filter.rs:71-73) was inverted. It returned None when the equality delete file was not in the cache, causing the loader to skip loading it. When build_equality_delete_predicate was later called, it would fail with "Missing predicate for equality delete file".

What changes are included in this PR?

The Fix:

  • Inverted the condition so it returns None when the file is already in the cache (being loaded or loaded), preventing duplicate work across concurrent tasks
  • When the file is not in the cache, mark it as Loading and proceed with loading

Additional Changes:

  • Added test case test_load_deletes_with_mixed_types that reproduces the bug scenario

Are these changes tested?

Yes, this PR includes a new unit test test_load_deletes_with_mixed_types that:

The test would fail before this fix and passes after.

@mbutrovich mbutrovich changed the title fix(reader): Support both position and equality delete on the same FileScanTask fix(reader): Support both position and equality delete files on the same FileScanTask Oct 22, 2025
@mbutrovich mbutrovich marked this pull request as ready for review October 29, 2025 19:49
@liurenjie1024
Copy link
Contributor

hi, @mbutrovich please help to resolve conflicts.

# Conflicts:
#	crates/iceberg/src/arrow/caching_delete_file_loader.rs
@mbutrovich
Copy link
Contributor Author

hi, @mbutrovich please help to resolve conflicts.

Done. Thank you for your help with reviewing!

mbutrovich added a commit to mbutrovich/iceberg-rust that referenced this pull request Nov 4, 2025
@liurenjie1024 liurenjie1024 merged commit fd08916 into apache:main Nov 5, 2025
16 checks passed
@mbutrovich mbutrovich deleted the pos_and_eq_deletes branch November 5, 2025 02:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants